**STL** is short for **standard template library**. The STL is C++'s standard collection of general-purpose classes and functions that implement many basic algorithms and data structures. Two of the main goals of the STL are *efficiency* and *generality*. All the STL algorithms are implemented as efficiently as possible, and they are well-tested and generally bug-free. They are also implemented in a very general way, so that they typically work with any data structure (i.e. arrays, vectors, strings, lists, files, etc.) that it makes sense for them to work on. Making a library of algorithms that is both efficient and general-purpose is non-trivial: most other languages settle for one or the other. In this note, we will look at some basic STL algorithms: `find`, `sort`, and `reverse`. We will also introduce **type parameters** and **generic programming**, two ideas the STL uses extensively. ## Linear Search: find `find` is the STL's version of [[linear search]]. The basic version of `find` has this header: ```cpp template<class InputIt, class T> constexpr InputIt find(InputIt first, InputIt last, const T& value) ``` Lets look at each part: - `template<class InputIt, class T>` indicates that `InputIt` and `T` are **type variables**. `T` is a variable representing the type of the value we are looking for, and it can be any type that works with `operator==`. `InputIt` is a type variable representing an [[iterators|iterator]] is an object that "points" to another object. - `constexpr InputIt` is the type of the value that `find` returns. `constexpr` means that `find` can be called at compile-time inside of a [[constant expression]]. We're not covering `const` expressions in this course, but they let you do computations at compile-time, which can greatly speed up some programs. `find` returns an `InputIt`, i.e. `find` returns an [[iterators|iterator]]. An [[iterators|iterator]] points to a another value, and so the returned [[iterators|iterator]] points to the value. If the value was *not* found, then the returned [[iterators|iterator]] is equal to `last` (i.e. 1 past the last value in the range being searched). - `find(InputIt first, InputIt last, const T& value)` means that `find` takes two [[iterators]], `first` and `last`, and a value to search for called `value`. `first` points to the first element of the range being searched, and `last` points to *one past* the end element of the same range. This asymmetry between `first` and `last` is important: if `find` returns `last`, then that means `value` was not found. The documentation for C++ gives a suggested implementation of `find`: ```cpp template<class InputIt, class T> constexpr InputIt find(InputIt first, InputIt last, const T& value) { for (; first != last; ++first) { if (*first == value) { return first; } } return last; } ``` This is a standard pointer-oriented implementation of linear search. Notice that `*first` is used when comparing against `value`. `first` is an [[iterators|iterator]], and so `*first` is the value that `first` points to. This is, intentionally, the same way that regular pointers work, and so it's often useful to think of [[iterators]] as pointers. `find` is very general, and works efficiently with pretty much every type of container where it makes sense to use [[linear search]]. Here are some examples of using `find`: ```cpp void test1() { vector<string> food = {"kale", "lettuce", "onion", "carrot", "potato", "tomato"}; if (find(begin(food), end(food), "carrot") == end(food)) { cout << "carrot NOT found\n"; } else { cout << "carrot found\n"; } if (find(begin(food), end(food), "squash") == end(food)) { cout << "squash NOT found\n"; } else { cout << "squash found\n"; } double costs[] = {2.00, 3.25, 0.99, 7.40, 2.99}; if (find(begin(costs), end(costs), 2.99) == end(costs)) { cout << "2.99 NOT found\n"; } else { cout << "2.99 found\n"; } if (find(begin(costs), end(costs), 3.99) == end(costs)) { cout << "3.99 NOT found\n"; } else { cout << "3.99 found\n"; } string quote = "No matter where you go, there you are."; if (find(begin(quote), end(quote), 'e') == end(quote)) { cout << "'e' NOT found\n"; } else { cout << "'e' found\n"; } if (find(begin(quote), end(quote), 'q') == end(quote)) { cout << "'q' NOT found\n"; } else { cout << "'q' found\n"; } } ``` ## Generalizing test1 with Templates `test1()` has a lot of repeated code which makes it hard to read. So here we create a helper function called `test_find` that simplifies things quite a bit. `test_find` is a [[generic function]], or [[generic function|templated function]] because it uses two type variables, `InputIt` and `T`. The type variables let `test_find` work with *any* data that we can pass to `find`. This kind of generality is one of the major advantages of the STL, and well-used it can result in code that is both efficient and highly re-usable. ```cpp template<class InputIt, class T> void test_find(InputIt first, InputIt last, const T& value) { if (find(first, last, value) == last) { cout << value << " NOT found\n"; } else { cout << value << " found\n"; } } void test2() { vector<string> food = {"kale", "lettuce", "onion", "carrot", "potato", "tomato"}; test_find(begin(food), end(food), "carrot"); test_find(begin(food), end(food), "squash"); double costs[] = {2.00, 3.25, 0.99, 7.40, 2.99}; test_find(begin(costs), end(costs), 2.99); test_find(begin(costs), end(costs), 3.99); string quote = "No matter where you go, there you are."; test_find(begin(quote), end(quote), 'e'); test_find(begin(quote), end(quote), 'q'); } ``` Notice how much simpler and easier to read `test2()` is thanks to `test_find`. Plus the body of `test_find` itself is short and easy to understand. ## Sorting and Reversing Sorting is one of the most common algorithms in all of programming, and so the STL provides an efficient sorting function called `std::sort`. `std::sort` is typically implemented using a fast general-purpose sort, such as variations of [[mergesort]] or [[quicksort]]. The C++ documentation doesn't specify exactly how `std:sort` is implemented, it just says it must do no more than $O(n \log n)$ comparisons in the worst case. The `reverse` function reverses the elements in a given range. Calling `sort` followed by `reverse` is a common way of putting a range into reverse sorted order. Here is some sample code: ```cpp void test3() { vector<string> food = {"kale", "lettuce", "onion", "carrot", "potato", "tomato"}; sort(begin(food), end(food)); for(string s : food) cout << s << " "; cout << "\n"; reverse(begin(food), end(food)); for(string s : food) cout << s << " "; cout << "\n"; double costs[] = {2.00, 3.25, 0.99, 7.40, 2.99}; sort(begin(costs), end(costs)); for(double c : costs) cout << c << " "; cout << "\n"; reverse(begin(costs), end(costs)); for(double c : costs) cout << c << " "; cout << "\n"; string quote = "No matter where you go, there you are."; sort(begin(quote), end(quote)); cout << quote << "\n"; reverse(begin(quote), end(quote)); cout << quote << "\n"; } ``` ## Generalizing test3 As we did with `test_find`, lets write a generic `test_sort` function to simplify `test3`. Since `test_sort` is only passed [[iterators]] to the first and last items in the range, we can't print the values in the same way as in `test3`. So we write `println`, a generic function that prints all the values in the range specified by the `first` and `last` [[iterators]]. The syntax is almost the same as what we would write if we were using regular pointers: `p` points to each element in the range in succession, and `*p` is the value that `p` points to: ```cpp template<class InputIt> void println(InputIt first, InputIt last) { for(InputIt p = first; p != last; p++) { cout << *p << " "; } cout << "\n"; } template<class InputIt> void test_sort(InputIt first, InputIt last) { sort(first, last); println(first, last); reverse(first, last); println(first, last); } ``` Unlike `test_find`, `test_sort` has only one type variable, `InputIt`, because it does *not* need a value to search for. Here is `test4`: ```cpp void test4() { vector<string> food = {"kale", "lettuce", "onion", "carrot", "potato", "tomato"}; test_sort(begin(food), end(food)); double costs[] = {2.00, 3.25, 0.99, 7.40, 2.99}; test_sort(begin(costs), end(costs)); string quote = "No matter where you go, there you are."; test_sort(begin(quote), end(quote)); } ``` Again, notice how simple and general-purpose the resulting code is. `test4()` is much easier to read than `test3()`. The essential structure of `test_sort` is quite readable, and works with any type of range that `std::sort` can sort. See [stl_demo.cpp](https://csil-git1.cs.surrey.sfu.ca/cmpt135/cmpt135_fall2020/cmpt-135-notes/-/tree/master/stl_demonstration) for all the sample source code used in this note. ### Example: Sorting with Different Comparison Functions The follow code demonstrates how `sort` can be used to sort records by different fields. This is quite useful in practice. ```cpp // sort_example.cpp #include "cmpt_error.h" #include <iostream> #include <vector> #include <algorithm> #include <string> using namespace std; struct Person { string first_name; string last_name; int age; }; ostream& operator<<(ostream& os, const Person& p) { os << "{" << p.first_name << ", " << p.last_name << ", " << p.age << "}"; return os; } // Person record could be sorted by first_name, last_name, or age. We make one // sorting comparison function for each possibility. bool by_first_name(const Person& a, const Person& b) { return a.first_name <= b.first_name; } bool by_last_name(const Person& a, const Person& b) { return a.last_name <= b.last_name; } bool by_age(const Person& a, const Person& b) { return a.age <= b.age; } // This comparison function is used to sort Person objects first by last name, // and then by first name. bool by_last_then_first(const Person& a, const Person& b) { if (a.last_name == b.last_name) { return by_first_name(a, b); } else { return by_last_name(a, b); } } int main() { vector<Person> data = { {"Sue", "Zee", 32}, {"Richard", "Pressman", 40}, {"Raine", "Storm", 32}, {"Carl", "Storm", 32}, {"Mimi", "Yuyu", 44}, {"Max", "Power", 40}, {"Sue", "Storm", 30} }; cout << "By first_name:\n"; sort(begin(data), end(data), by_first_name); for(const Person& p : data) cout << p << "\n"; cout << "\n"; cout << "By first_name, reverse alphabetical:\n"; sort(begin(data), end(data), by_first_name); reverse(begin(data), end(data)); for(const Person& p : data) cout << p << "\n"; cout << "\n"; cout << "By last_name:\n"; sort(begin(data), end(data), by_last_name); for(const Person& p : data) cout << p << "\n"; cout << "\n"; cout << "By age, lowest to highest:\n"; sort(begin(data), end(data), by_age); for(const Person& p : data) cout << p << "\n"; cout << "\n"; cout << "By age, highest to lowest:\n"; sort(begin(data), end(data), by_age); reverse(begin(data), end(data)); for(const Person& p : data) cout << p << "\n"; cout << "\n"; cout << "By last name, then first name:\n"; sort(begin(data), end(data), by_last_then_first); for(const Person& p : data) cout << p << "\n"; } // main ``` ## Practice Problems 1. What does STL stand for? 2. What are two of the main design goals of the STL? 3. What is a generic function? How does it differ from a regular function? 4. What are iterators? How do they relate to pointers? 5. In your own words, explain how the standard `find` function works. What does it return if it can't find the value it's looking for? 6. Rewrite the implementation of `find` given in the notes using a while-loop in place of the for-loop. 7. Implement your own function called `reverse_sort` that efficiently sorts any sequence in reverse sorted order. Use the STL `sort` and `reverse` functions. 8. Suppose you have a function called `iter_swap(p, q)` that takes two iterators, `p` and `q`, as input and exchanges the values they point to. Use that function to help implement your own version of the general STL `reverse` function.